Markov Decision Processes with Constrained Stopping Times
Abstract
We consider the optimization problem for a stopped Markov decision process in which the optimization is taken over stopping times $\tau$ constrained so that $E[\tau] \le \alpha$ for some fixed $\alpha > 0$. We introduce the concept of a randomized stationary stopping time, a mixed extension of the entry time of a stopping region, and prove the existence of an optimal constrained pair of stationary policy and stopping time by a Lagrange multiplier approach. Applying the idea of the one-step look-ahead (OLA) policy, we also obtain the optimal constrained pair concretely. As an example, a constrained Markov deteriorating system is explained.
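To make the Lagrange multiplier and OLA ideas concrete, here is a minimal sketch under strong simplifying assumptions; it is not the paper's construction. It assumes an uncontrolled finite Markov chain (no policy choice), a terminal reward `r` received on stopping, and a per-step penalty `lam` standing in for the multiplier on the constraint on the expected stopping time (written here as E[τ] ≤ α). The matrix `P`, the rewards `r`, and both function names are hypothetical. The OLA region collects the states where stopping now is at least as good as continuing for exactly one more step; the entry time into that region is the candidate stopping time.

```python
# Hypothetical 4-state deteriorating system: state 0 is new, state 3 is failed.
# The chain only moves to equal or worse states (upper-triangular P).
P = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]
r = [0.0, 2.0, 3.0, 0.0]  # assumed terminal reward for stopping in each state


def ola_region(P, r, lam):
    """States where stopping now beats continuing one step and then stopping.

    lam is the per-step penalty (the Lagrange multiplier on E[tau]).
    """
    n = len(r)
    return {
        x
        for x in range(n)
        if r[x] >= sum(P[x][y] * r[y] for y in range(n)) - lam
    }


def expected_entry_time(P, region, tol=1e-12, max_sweeps=100_000):
    """Expected number of steps to enter `region` from each state.

    Solves t(x) = 0 on the region and t(x) = 1 + sum_y P[x][y] t(y) elsewhere
    by repeated sweeps; raises if the sweeps do not converge.
    """
    n = len(P)
    t = [0.0] * n
    for _ in range(max_sweeps):
        delta = 0.0
        for x in range(n):
            if x in region:
                continue
            new = 1.0 + sum(P[x][y] * t[y] for y in range(n))
            delta = max(delta, abs(new - t[x]))
            t[x] = new
        if delta < tol:
            return t
    raise RuntimeError("entry time did not converge; region may be unreachable")


# Larger penalties enlarge the stopping region and shorten the entry time.
for lam in (0.2, 0.6, 1.2):
    region = ola_region(P, r, lam)
    print(lam, sorted(region), expected_entry_time(P, region)[0])
```

In this toy setting one would raise `lam` until the expected entry time from the initial state drops to the bound α. Because the expected entry time changes in jumps as `lam` varies, a pure entry time generally cannot meet the bound with equality, which is plausibly the role of the randomized stationary stopping time mentioned in the abstract.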
Related resources
Variance minimization for constrained discounted continuous-time MDPs with exponentially distributed stopping times
This paper deals with minimization of the variances of the total discounted costs for constrained Continuous-Time Markov Decision Processes (CTMDPs). The costs consist of cumulative costs incurred between jumps and instant costs incurred at jump epochs. We interpret discounting as an exponentially distributed stopping time. According to existing theory, for the expected total discounted costs o...
Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
In many sequential decision-making problems one is interested in minimizing an expected cumulative cost while taking into account risk, i.e., increased awareness of events of small probability and high consequences. Accordingly, the objective of this paper is to present efficient reinforcement learning algorithms for risk-constrained Markov decision processes (MDPs), where risk is represented v...
Finite Horizon Optimal Stopping of Time-Discontinuous Functionals with Applications to Impulse Control with Delay
We study finite horizon optimal stopping problems for continuous time Feller-Markov processes. The functional depends on time, state and external parameters, and may exhibit discontinuities with respect to the time-variable. Both left and right-hand discontinuities are considered. We investigate the dependence of the value function on the parameters, initial state of the process and on the stop...
Optimal Starting-Stopping Problems for Markov-Feller Processes
By means of nested inequalities in semigroup form we give a characterization of the value functions of the starting-stopping problem for general Markov-Feller processes. Next, we consider two versions of constrained problems on the final state or on the final time. The plan is as follows:
Multiple Stopping Time POMDPs: Structural Results & Application in Interactive Advertising in Social Media
This paper considers a multiple stopping time problem for a Markov chain observed in noise, where a decision maker chooses at most L stopping times to maximize a cumulative objective. We formulate the problem as a Partially Observed Markov Decision Process (POMDP) and derive structural results for the optimal multiple stopping policy. The main results are as follows: i) The optimal multiple sto...
Publication date: 2000